# GPTQ Quantization

**Qwen3 Reranker 4B W4A16 G128**
Apache-2.0 · boboliu · 157 downloads · 1 like · Large Language Model, Transformers
A GPTQ quantization (W4A16, group size 128) of Qwen/Qwen3-Reranker-4B that significantly reduces VRAM usage.
**Qwen3 Embedding 0.6B W4A16 G128**
Apache-2.0 · boboliu · 131 downloads · 2 likes · Text Embedding
A GPTQ quantization of Qwen3-Embedding-0.6B that reduces VRAM usage with minimal performance loss.
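
As a rough illustration of how a quantized embedding checkpoint like this is typically used, the sketch below loads it through the sentence-transformers library and computes a pair of similarity scores. The repository ID is reconstructed from the listing above and may not match the exact name on the hub, and a GPTQ-capable backend (for example the gptqmodel or auto-gptq package) is assumed to be installed alongside transformers.

```python
# Minimal sketch: embedding texts with a GPTQ-quantized Qwen3-Embedding model.
# The repo ID below is an assumption based on the listing; verify it on the hub.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "boboliu/Qwen3-Embedding-0.6B-W4A16-G128",  # assumed repository ID
    device="cuda",
)

sentences = [
    "GPTQ quantization reduces VRAM usage.",
    "Weight-only quantization shrinks model memory footprints.",
]

# encode() returns a (num_sentences, hidden_dim) numpy array; normalizing
# the rows lets a plain dot product act as cosine similarity.
embeddings = model.encode(sentences, normalize_embeddings=True)
print(embeddings @ embeddings.T)
```
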
**Qwen3 0.6B GPTQ Int8**
Apache-2.0 · Qwen · 1,231 downloads · 3 likes · Large Language Model, Transformers
An 8-bit GPTQ quantization of Qwen3-0.6B, the 0.6B-parameter model in the latest Qwen series, which supports switching between thinking and non-thinking modes and offers strong reasoning, instruction-following, and agent capabilities.
**Qwen3 1.7B GPTQ Int8**
Apache-2.0 · Qwen · 635 downloads · 1 like · Large Language Model, Transformers
An 8-bit GPTQ quantization of Qwen3-1.7B from the Tongyi Qianwen (Qwen) series, supporting switching between thinking and non-thinking modes with improved reasoning and multilingual support.
**Orpheus 3B 0.1 FT W8A8**
Apache-2.0 · nytopop · 173 downloads · 0 likes · Large Language Model, Transformers, English
A W8A8 quantization of Orpheus-3B-0.1-FT, a text-to-speech model built on a causal language model that supports efficient quantized compression.
**Qwen2.5 VL 3B Instruct GPTQ Int4**
Apache-2.0 · hfl · 1,312 downloads · 2 likes · Image-to-Text, Transformers, Multilingual
A GPTQ-Int4 quantization of Qwen2.5-VL-3B-Instruct for multimodal image-to-text and text-to-text tasks, supporting both Chinese and English.
**Meta Llama 3.1 8B Instruct GPTQ INT4**
hugging-quants · 128.18k downloads · 25 likes · Large Language Model, Transformers, Multilingual
An INT4 quantization of Meta-Llama-3.1-8B-Instruct produced with the GPTQ algorithm, suitable for multilingual dialogue scenarios.
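
Checkpoints such as the hugging-quants build above are normally loaded through the standard Transformers API, which reads the GPTQ configuration stored in the repository. The sketch below is a minimal example under that assumption; it requires torch, transformers, and a GPTQ kernel backend (such as gptqmodel or auto-gptq), and the repository ID is reconstructed from the listing, so it should be verified before use.

```python
# Minimal sketch: chat-style generation with a GPTQ INT4 checkpoint.
# Repo ID reconstructed from the listing above; verify it before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # place the already-quantized weights on available GPUs
)

messages = [{"role": "user", "content": "Explain GPTQ quantization in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
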
**Five Phases Mindset**
GPL-3.0 · cookey39 · 14 downloads · 2 likes · Large Language Model, Transformers, Chinese
A Qwen-based traditional Chinese medicine (TCM) consultation model that integrates Five Elements theory to provide personalized TCM diagnostic services.
**Llama 2 13B GPTQ**
TheBloke · 538 downloads · 121 likes · Large Language Model, Transformers, English
A GPTQ quantization of Meta's Llama 2 13B model for efficient inference.